Abstract

We exploit the separation of the filtering and control aspects of quan- 
tum feedback control to consider the optimal control as a classical stochas- 
tic problem on the space of quantum states. We derive the corresponding 
Hamilton-Jacobi-Bellman equations using the elementary arguments of 
classical control theory and show that this is equivalent, in the Stratonovich 
calculus, to a stochastic Hamilton-Pontryagin setup. We show that, for 
cost functionals that are linear in the state, the theory yields the
traditional Bellman equations treated so far in quantum feedback. A controlled
qubit with feedback is considered as an example.

1 Introduction 

When engineers set about to control a classical system, they can invoke the
celebrated Separation Theorem, which allows them to treat the problem of
estimating the state of the system (based on typically partial observations)
separately from the problem of how to optimally control the system (through
feedback of these observations into the system dynamics), see for instance [13].
Remarkably, as was pointed out for the first time in [?], this is also true when
trying to control the quantum world, see also [?], [?], [?]. To begin with, the
very act of measurement itself never supplies anything but incomplete
information about the state of a system and, as is well known, alters the state
in the process. However, provided we use a non-demolition principle [3] when
measuring the system over time, we can apply a filter scheme for state
estimation continuously in time. The general theory of continuous-in-time
nondemolition measurements and filtering was developed by Belavkin in [3], [?],
[?], [7]; here, however, we will use its final result for a simple quantum
diffusion model described by the quantum state filtering equation with a single
white noise innovation, see e.g. [?]. We should emphasize that the
continuous-time filtering theory for this case can be obtained as the limit of
discrete-time measurements where nothing beyond the standard von Neumann
projection postulate is used [?], [?], [?], [26]. Once the filtered dynamics is
known, the problem of optimal feedback control of the system can then be
formulated as a distinct problem. Modern experimental physics has opened up
unprecedented opportunities to manipulate the quantum world, and feedback
control has already been successfully implemented for real physical systems [?].
Currently, these activities have attracted interest in related mathematical
issues such as stability, observability, etc. [11], [21], [15], [22].

The separation of the classical world from the quantum world is, in practice, 
the most notoriously troublesome task faced in modern physics. At the very 
heart of this issue is the very different meanings we attach to the word state. 
What we want to remark upon, and exploit, is the fact that the separation 
of the control problem from the filtering gives us just the required separation 
of classical from quantum features. By the quantum state we mean the von 
Neumann density matrix which yields all the (stochastic) information available 
about the system at the current time - this we also take to be the state in the 
sense used in control engineering. All the quantum features are contained in this 
state, and the filtering equation it satisfies may then be understood as a classical
stochastic differential equation which just happens to have solutions that are
von Neumann density matrix valued stochastic processes. The ensuing problem 
of determining optimal control may then be viewed as a classical problem, albeit 
on the unfamiliar state space of von Neumann density matrices rather than the 
Euclidean spaces to which we are usually accustomed. Once we get used to this 
setting, the problem of dynamic programming, Bellman's optimality principle,
and so on, can be formulated in the same spirit as before. 

We shall consider optimization for cost functions that are non-linear func- 
tionals of the state. Traditionally quantum control has been restricted to linear 
functions where - given the physical meaning attached to a quantum state - the 
cost functions are therefore expectations of certain observables. In this situa- 
tion, which we consider as a special case, we see that the distinction between 
classical and quantum features may be blurred: that is, the classical information 
about the measurement observations can be incorporated as additional random- 
ness into the quantum state. This is the likely reason why the separation does 
not seem to have been taken up before. 



2 Notations 

The Hilbert space for our fixed quantum system will be a complex, separable
Hilbert space $\mathfrak{h}$. We shall use the following spaces of operators:

$\mathcal{A} = \mathfrak{B}(\mathfrak{h})$ - the Banach algebra of bounded operators on $\mathfrak{h}$;
$\mathcal{A}_\star = \mathfrak{T}(\mathfrak{h})$ - the predual space of trace-class operators on $\mathfrak{h}$;
$\mathcal{S} = \mathfrak{S}(\mathfrak{h})$ - the positive, unit-trace operators (states) on $\mathfrak{h}$.

The space $\mathcal{A}_\star$ equipped with the trace norm $\|\varrho\|_1 = \mathrm{tr}|\varrho|$ is a complex Banach
space, the dual of which is identified with the algebra $\mathcal{A}$ with the usual operator
norm. The natural duality between the spaces $\mathcal{A}_\star$ and $\mathcal{A}$ is indicated by

$$ \langle \varrho, X \rangle := \mathrm{tr}\{\varrho X\}, \tag{1} $$

for each $\varrho \in \mathcal{A}_\star$, $X \in \mathcal{A}$. The positive elements of $\mathcal{A}_\star$ normalized as $\|\varrho\|_1 = 1$
are called normal states, and the extremal elements $\varrho \in \mathcal{S}$ of the convex set
$\mathcal{S} \subset \mathcal{A}_\star$ correspond to pure quantum states. The symmetric tensor power
$\mathcal{A}^{\otimes 2}_{\mathrm{sym}} = \mathcal{A} \otimes_{\mathrm{sym}} \mathcal{A}$ of the algebra $\mathcal{A}$ is the subalgebra of $\mathfrak{B}(\mathfrak{h}^{\otimes 2})$ of all bounded
operators on the Hilbert product space $\mathfrak{h}^{\otimes 2} = \mathfrak{h} \otimes \mathfrak{h}$ commuting with the unitary
involutive operator $S = S^\dagger$ of permutations $\eta_1 \otimes \eta_2 \mapsto \eta_2 \otimes \eta_1$ for any $\eta_i \in \mathfrak{h}$.

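A map $\mathcal{L}(\cdot)$ from $\mathcal{A} = \mathfrak{B}(\mathfrak{h})$ to itself is said to be a Lindblad generator if it
takes the form

$$ \mathcal{L}(X) = -i[X, H] + \sum_\alpha \mathcal{L}_{R_\alpha}(X), \tag{2} $$

$$ \mathcal{L}_R(X) = R^\dagger X R - \tfrac{1}{2} R^\dagger R X - \tfrac{1}{2} X R^\dagger R \tag{3} $$

with $H$ self-adjoint, the $R_\alpha \in \mathcal{A}$ and $\sum_\alpha R_\alpha^\dagger R_\alpha \in \mathcal{A}$ (ultraweakly convergent
for an infinite set $\{R_\alpha\}$). The generator is Hamiltonian if it just takes the form
$i[H, \cdot]$. The preadjoint $\mathcal{L}' = \mathcal{L}_\star$ of a generator $\mathcal{L}$ is defined on the preadjoint
space $\mathcal{A}_\star$ through the relation $\langle \mathcal{L}'(\varrho), X \rangle = \langle \varrho, \mathcal{L}(X) \rangle$. We note that Lindblad
generators have the property $\mathcal{L}(I) = 0$, corresponding to conservation of the
identity operator $I \in \mathcal{A}$ or, equivalently, $\mathrm{tr}\{\mathcal{L}'(\varrho)\} = 0$ for all $\varrho \in \mathcal{A}_\star$.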
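The two properties just stated, together with the defining duality of the preadjoint, can be checked numerically on a qubit. The following is a minimal sketch rather than part of the original text: the 2x2 helper functions and the particular choices of $H$, $R$, $\varrho$ and $X$ are illustrative assumptions.

```python
# Illustrative 2x2 matrix helpers (nested lists of complex numbers).
def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lin(*terms):
    """Linear combination c1*m1 + c2*m2 + ... of 2x2 matrices."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)]
            for i in range(2)]

def dag(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def tr(a):
    return a[0][0] + a[1][1]

def pair(rho, X):
    """Duality <rho, X> = tr{rho X}, equation (1)."""
    return tr(mul(rho, X))

def lindblad(X, H, Rs):
    """L(X) = -i[X,H] + sum_a (R^+ X R - 1/2 R^+ R X - 1/2 X R^+ R), eqs. (2)-(3)."""
    out = lin((-1j, mul(X, H)), (1j, mul(H, X)))
    for R in Rs:
        Rd = dag(R)
        RdR = mul(Rd, R)
        out = lin((1, out), (1, mul(mul(Rd, X), R)),
                  (-0.5, mul(RdR, X)), (-0.5, mul(X, RdR)))
    return out

def lindblad_pre(rho, H, Rs):
    """Preadjoint L'(rho) = i[rho,H] + sum_a (R rho R^+ - 1/2 rho R^+R - 1/2 R^+R rho)."""
    out = lin((1j, mul(rho, H)), (-1j, mul(H, rho)))
    for R in Rs:
        Rd = dag(R)
        RdR = mul(Rd, R)
        out = lin((1, out), (1, mul(mul(R, rho), Rd)),
                  (-0.5, mul(rho, RdR)), (-0.5, mul(RdR, rho)))
    return out

I2 = [[1, 0], [0, 1]]
H = [[0, 1], [1, 0]]                           # self-adjoint Hamiltonian (example)
R = [[0, 1], [0, 0]]                           # single coupling operator (example)
rho = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]   # a density matrix (example)
X = [[1, 2j], [-2j, -0.5]]                     # an observable (example)

identity_image = lindblad(I2, H, [R])          # should vanish: L(I) = 0
trace_of_pre = tr(lindblad_pre(rho, H, [R]))   # should vanish: tr L'(rho) = 0
duality_gap = pair(lindblad_pre(rho, H, [R]), X) - pair(rho, lindblad(X, H, [R]))
```

All three quantities vanish up to rounding, which is the content of $\mathcal{L}(I) = 0$, $\mathrm{tr}\{\mathcal{L}'(\varrho)\} = 0$ and $\langle \mathcal{L}'(\varrho), X \rangle = \langle \varrho, \mathcal{L}(X) \rangle$ for this example.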

In quantum control theory it is necessary to consider time-dependent
generators $\mathcal{L}(t)$, through an integrable time dependence of the controlled Hamiltonian
$H(t)$, or more generally due to a square-integrable time dependence of the
coupling operators $R_\alpha(t)$. We will always assume that these integrability
conditions, corresponding to the existence of the unique solution $\varrho(t) = P_t(t_0, \varrho_0)$
to the quantum state master equation

$$ \frac{d}{dt} \varrho(t) = \mathcal{L}'(t, \varrho(t)) = v(t, \varrho(t)), \tag{4} $$

for all $t \ge t_0$ given an initial condition $\varrho(t_0) = \varrho_0 \in \mathcal{S}$, are fulfilled.

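Let $F = F[\cdot]$ be a (nonlinear) functional $\varrho \mapsto F[\varrho]$ on $\mathcal{S}$; we say it admits
a (Fréchet) derivative if there exists an $\mathcal{A}$-valued function $\nabla_\varrho F[\cdot]$ on $\mathcal{A}_\star$ such that

$$ \lim_{h \to 0} \frac{1}{h} \{ F[\cdot + h\tau] - F[\cdot] \} = \langle \tau, \nabla_\varrho F[\cdot] \rangle, \tag{5} $$

for each $\tau \in \mathcal{A}_\star$. In the same spirit, a Hessian $\nabla_\varrho^{\otimes 2} = \nabla_\varrho \otimes \nabla_\varrho$ can be defined as a
mapping from the functionals on $\mathcal{S}$ to the $\mathcal{A}^{\otimes 2}_{\mathrm{sym}} := \mathcal{A} \otimes_{\mathrm{sym}} \mathcal{A}$-valued functionals,
via

$$ \lim_{h, h' \to 0} \frac{1}{h h'} \{ F[\cdot + h\tau + h'\tau'] - F[\cdot + h\tau] - F[\cdot + h'\tau'] + F[\cdot] \} = \langle \tau \otimes \tau', \nabla_\varrho \otimes \nabla_\varrho F[\cdot] \rangle, \tag{6} $$

and we say that the functional is twice continuously differentiable whenever
$\nabla_\varrho^{\otimes 2} F[\cdot]$ exists and is continuous in the trace norm topology.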






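Likewise, a functional $f : X \mapsto f[X]$ on $\mathcal{A}$ is said to admit an $\mathcal{A}_\star$-derivative
if there exists an $\mathcal{A}_\star$-valued function $\nabla_X f[\cdot]$ on $\mathcal{A}$ such that

$$ \lim_{h \to 0} \frac{1}{h} \{ f[\cdot + hA] - f[\cdot] \} = \langle \nabla_X f[\cdot], A \rangle \tag{7} $$

for each $A \in \mathfrak{B}(\mathfrak{h})$.

With the customary abuses of differential notation, we have for instance

$$ \nabla_\varrho f(\langle \varrho, X \rangle) = f'(\langle \varrho, X \rangle)\, X, \qquad \nabla_X f(\langle \varrho, X \rangle) = f'(\langle \varrho, X \rangle)\, \varrho. $$

Typically, we shall use $\nabla_\varrho$ more often, and tend to denote it by $\delta$ (as "inverse" to
the notation $\varrho$), leaving the simple notation $\nabla$ for $\nabla_X$.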

3 Quantum Filtering Equation 

The state of an individual continuously measured quantum system does not
coincide with the solution $\varrho(t)$ of the deterministic master equation (4) but is an
$\mathcal{S}$-valued stochastic process $\varrho_\bullet(t) : \omega \mapsto \varrho_\omega(t)$ which depends on the random
measurement output $\omega = \{\omega(t)\} \in \Omega$ in a causal manner. We take the output
process to constitute a white noise, in which case we may work with the
innovations process, which will be a Wiener process $W(t)$ defined in the
generalized sense by $\frac{d}{dt} W(t) = \omega(t)$ with $W(0) = 0$. The Belavkin quantum filtering
equation in this case is [?]

$$ d\varrho_\bullet(t) = w(t, u(t), \varrho_\bullet(t))\, dt + \sigma(\varrho_\bullet(t))\, dW(t) \tag{8} $$

where $dW(t) = W(t + dt) - W(t)$, the time coefficient is

$$ w(t, u, \varrho) = i[\varrho, H(t, u)] + \mathcal{L}'_R(\varrho) + \mathcal{L}'_L(\varrho), \tag{9} $$

with $\mathcal{L}'_L(\varrho)$ of the form

$$ \mathcal{L}'_L(\varrho) = L \varrho L^\dagger - \tfrac{1}{2} \varrho L^\dagger L - \tfrac{1}{2} L^\dagger L \varrho, $$

and the fluctuation coefficient is

$$ \sigma(\varrho) = L\varrho + \varrho L^\dagger - \langle \varrho, L + L^\dagger \rangle \varrho. \tag{10} $$

Here $L$ is a bounded operator describing the coupling of the system to the
measurement apparatus.

The time coefficient $w$ consists of three separate terms: the first term is
Hamiltonian and depends on a control parameter $u$ belonging to some parameter
space $U$ which we must specify at each time; the second term is the adjoint
of a general Lindblad generator $\mathcal{L}_R$ and describes the uncontrolled, typically
dissipative, effect of the environment; the final term is adjoint to the Lindblad
generator $\mathcal{L}_L(X)$ which is related to the coupling operator $L$.
The maps $w$ and $\sigma$ are required to be Lipschitz continuous in all their
components: for $L$ constant and bounded, this will be automatic for the $\varrho$-variable
with the notion of trace norm topology. We remark that $\mathrm{tr}\{\sigma(\varrho)\} = 0$ for
normalized $\varrho$ and, by conservativity, $\mathrm{tr}\{w(t, u, \varrho)\} = 0$ for all $\varrho \in \mathcal{A}_\star$. This implies
that the normalization $\mathrm{tr}\{\varrho\}$ is a conserved quantity, $\mathrm{tr}\{\varrho_\bullet(t)\} = \mathrm{tr}\{\varrho\}$, under the
stochastic evolution (8).
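Conservation of the trace can be observed directly in a simple Euler-Maruyama discretization of the Itô filtering equation for a qubit. The sketch below is an illustration only: the scheme, the step size, the choice $L = \tfrac{1}{2}\kappa\,\mathrm{diag}(1,-1)$, the constant (open-loop) Hamiltonian and the helper names are all assumptions made for the example, and $\mathcal{L}_R$ is set to zero.

```python
import math
import random

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lin(*terms):
    """Linear combination c1*m1 + c2*m2 + ... of 2x2 matrices."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)]
            for i in range(2)]

def dag(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def tr(a):
    return a[0][0] + a[1][1]

def drift_w(rho, H, L):
    """Ito drift with L_R = 0: i[rho,H] + L rho L^+ - 1/2 rho L^+L - 1/2 L^+L rho."""
    Ld = dag(L)
    LdL = mul(Ld, L)
    return lin((1j, mul(rho, H)), (-1j, mul(H, rho)),
               (1, mul(mul(L, rho), Ld)),
               (-0.5, mul(rho, LdL)), (-0.5, mul(LdL, rho)))

def sigma(rho, L):
    """Fluctuation coefficient: L rho + rho L^+ - <rho, L + L^+> rho."""
    Ld = dag(L)
    ell = tr(mul(rho, lin((1, L), (1, Ld))))
    return lin((1, mul(L, rho)), (1, mul(rho, Ld)), (-ell, rho))

def simulate(rho, H, L, dt, steps, rng):
    """Euler-Maruyama steps rho += w dt + sigma dW for the Ito equation."""
    for _ in range(steps):
        dW = rng.gauss(0.0, math.sqrt(dt))
        rho = lin((1, rho), (dt, drift_w(rho, H, L)), (dW, sigma(rho, L)))
    return rho

kappa = 1.0
L = [[0.5 * kappa, 0], [0, -0.5 * kappa]]   # L = (kappa/2) * Pauli-z (example)
H = [[0, 0.4], [0.4, 0]]                    # constant Hamiltonian (example)
rho0 = [[0.5, 0.5], [0.5, 0.5]]             # pure state, Bloch vector (1, 0, 0)
rho_T = simulate(rho0, H, L, dt=1e-3, steps=1000, rng=random.Random(0))
trace_drift = abs(tr(rho_T) - 1)
```

The trace stays equal to one up to rounding because each Euler step adds only traceless increments when $\mathrm{tr}\{\varrho\} = 1$. Positivity is not guaranteed by this naive scheme and is not checked here.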

A choice of control function $\{u(t) : t \in [T_1, T_2]\}$ is required before we can
solve the filtering equation (8) on the time interval $[T_1, T_2]$ for a given initial state
at time $T_1$. From what we have said above, this is required to be a $U$-valued
function which we take to be continuous for the moment.

Let $\{P_{r,\omega}(t, \varrho) : r \ge t, \omega \in \Omega\}$ be the solution $\varrho_\bullet(r) = P_{r,\bullet}(t, \varrho)$ to (8)
starting in state $\varrho_\bullet(t) = \varrho$ at time $r = t$ for all $\omega \in \Omega$. This will be a Markov process
in $\mathcal{S}$ (embedded in the Banach space $\mathcal{A}_\star$), see for instance [12], and we remark
that, for twice continuously differentiable functionals $F$ on $\mathcal{A}_\star$, we will have

$$ \lim_{h \to 0^+} \frac{1}{h} \{ E[F[P_{t+h,\bullet}(t, \varrho)] - F[\varrho]] \} = D(t, u, \varrho) F[\varrho], $$

where $D(t, u, \varrho)$ is the elliptic operator defined by

$$ D(t, u, \varrho)\, \cdot = \langle w(t, u, \varrho), \delta \cdot \rangle + \tfrac{1}{2} \langle \sigma(\varrho) \otimes \sigma(\varrho), (\delta \otimes \delta) \cdot \rangle. \tag{11} $$

For the classical analogue of stochastic flows on manifolds, see for instance [10].

3.1 Stratonovich Version 

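We convert to the Stratonovich picture by means of the identity [?]

$$ \sigma(\varrho_\bullet)\, dW = \sigma(\varrho_\bullet) \circ dW - \tfrac{1}{2}\, d\sigma(\varrho_\bullet)\, dW $$

and from (10) we get

$$ d\sigma(\varrho_\bullet) = L\, d\varrho_\bullet + d\varrho_\bullet L^\dagger - \langle d\varrho_\bullet, L + L^\dagger \rangle \varrho_\bullet - \langle \varrho_\bullet, L + L^\dagger \rangle d\varrho_\bullet - \langle d\varrho_\bullet, L + L^\dagger \rangle d\varrho_\bullet. $$

After a little algebra, we obtain the Stratonovich form of the Belavkin filtering
equation:

$$ d\varrho_\bullet = v(t, u, \varrho_\bullet)\, dt + \sigma(\varrho_\bullet) \circ dW \tag{12} $$

where, with $\sigma = \sigma(\varrho)$,

$$ v(t, u, \varrho) = w(t, u, \varrho) - \tfrac{1}{2} \{ L\sigma + \sigma L^\dagger - \langle \sigma, L + L^\dagger \rangle \varrho - \langle \varrho, L + L^\dagger \rangle \sigma \} $$
$$ \phantom{v(t, u, \varrho)} = i[\varrho, H(t, u)] + \mathcal{L}'_R(\varrho) + K(\varrho) \varrho + \varrho K(\varrho)^\dagger + F(\varrho) \varrho, \tag{13} $$

where we introduce the operator-valued function

$$ K(\varrho) := -\tfrac{1}{2} (L + L^\dagger) L + \langle \varrho, L + L^\dagger \rangle L \tag{14} $$

and the scalar-valued function

$$ F(\varrho) := \tfrac{1}{2} \langle \varrho, L^2 + 2 L^\dagger L + L^{\dagger 2} \rangle - \langle \varrho, L + L^\dagger \rangle^2. \tag{15} $$

We refer to $w$ in (9) and $v$ in (13) as the Itô and Stratonovich state velocities,
respectively. We note that the decoherent component $L \varrho L^\dagger$ appearing in $\mathcal{L}'_L$,
and present in $w(t, u, \varrho)$, is now absent in $v(t, u, \varrho)$.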
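The "little algebra" leading to the closed form of the Stratonovich velocity can be cross-checked numerically. The sketch below, which is not taken from the original text, evaluates both expressions for $v$ on a qubit with $\mathcal{L}_R = 0$ and compares them; the helper names and the (deliberately non-self-adjoint) test operator $L$ are illustrative assumptions.

```python
def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lin(*terms):
    """Linear combination c1*m1 + c2*m2 + ... of 2x2 matrices."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)]
            for i in range(2)]

def dag(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def tr(a):
    return a[0][0] + a[1][1]

def drift_w(rho, H, L):
    """Ito velocity with L_R = 0."""
    Ld = dag(L)
    LdL = mul(Ld, L)
    return lin((1j, mul(rho, H)), (-1j, mul(H, rho)),
               (1, mul(mul(L, rho), Ld)),
               (-0.5, mul(rho, LdL)), (-0.5, mul(LdL, rho)))

def sigma(rho, L):
    Ld = dag(L)
    ell = tr(mul(rho, lin((1, L), (1, Ld))))
    return lin((1, mul(L, rho)), (1, mul(rho, Ld)), (-ell, rho))

def v_from_ito(rho, H, L):
    """v = w - 1/2 {L s + s L^+ - <s, L+L^+> rho - <rho, L+L^+> s}, s = sigma(rho)."""
    Ld = dag(L)
    LpLd = lin((1, L), (1, Ld))
    s = sigma(rho, L)
    ell = tr(mul(rho, LpLd))
    s_pair = tr(mul(s, LpLd))
    corr = lin((1, mul(L, s)), (1, mul(s, Ld)), (-s_pair, rho), (-ell, s))
    return lin((1, drift_w(rho, H, L)), (-0.5, corr))

def v_closed_form(rho, H, L):
    """v = i[rho,H] + K rho + rho K^+ + F rho with K and F as in (14), (15)."""
    Ld = dag(L)
    LpLd = lin((1, L), (1, Ld))
    ell = tr(mul(rho, LpLd))
    K = lin((-0.5, mul(LpLd, L)), (ell, L))
    quad = lin((1, mul(L, L)), (2, mul(Ld, L)), (1, mul(Ld, Ld)))
    F = 0.5 * tr(mul(rho, quad)) - ell ** 2
    return lin((1j, mul(rho, H)), (-1j, mul(H, rho)),
               (1, mul(K, rho)), (1, mul(rho, dag(K))), (F, rho))

rho = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]   # density matrix (example)
H = [[0.3, 0.5j], [-0.5j, -0.3]]               # self-adjoint Hamiltonian (example)
L = [[0.3, 1.0], [0.2j, -0.4]]                 # coupling operator (example)

a, b = v_from_ito(rho, H, L), v_closed_form(rho, H, L)
max_gap = max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

The two velocities agree to rounding error, and the closed form visibly contains no $L \varrho L^\dagger$ term.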

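The elliptic operator $D(t, u, \varrho)$ can then be put into Hörmander form as

$$ D(t, u, \varrho)\, (\cdot) := \langle v(t, u, \varrho), \delta \cdot \rangle + \tfrac{1}{2} \langle \sigma(\varrho), \delta \langle \sigma(\varrho), \delta \cdot \rangle \rangle, \tag{16} $$

by using the equality (13) in the definition (11).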

4 Optimal Control 

The cost for a control function $\{u(r)\}$ over any time interval $[t, T]$ is random,
taken to have the integral form

$$ J_\bullet[\{u(r)\}; t, \varrho] = \int_t^T C(r, u(r), \varrho_\bullet(r))\, dr + S(\varrho_\bullet(T)) \tag{17} $$

where $\{\varrho_\bullet(r) : r \in [t, T]\}$ is the solution to the filtering equation with initial
condition $\varrho_\bullet(t) = \varrho$. We assume that the cost density $C$ and the terminal cost $S$
are continuously differentiable in each of their arguments. In fact, due to the
statistical interpretation of quantum states, we should consider only the linear
dependence

$$ C(r, u, \varrho) = \langle \varrho, C(r, u) \rangle, \qquad S(\varrho) = \langle \varrho, S \rangle \tag{18} $$

of $C$ and $S$ on the state $\varrho$, as was already suggested in [?]. We will
explicitly consider this case later, but for the moment we will not use the linearity
of $C$ and $S$.

The feedback control $u(t)$ is to be considered a random variable $u_\omega(t)$
adapted with respect to the innovation process $W(t)$, and so we consider
the problem of minimizing its average cost value with respect to $\{u_\bullet(t)\}$.
To this end, we define the optimal average cost to be

$$ S(t, \varrho) := \inf_{\{u_\bullet(r)\}} E\left[ J_\bullet[\{u_\bullet(r)\}; t, \varrho] \right], \tag{19} $$

where the minimum is considered over all measurable adapted control strategies
$\{u_\bullet(r) : r \ge t\}$. The aim of feedback control theory is then to find an optimal
control strategy $\{u^*_\bullet(t)\}$ and evaluate $S(t, \varrho)$ on a fixed time interval $[t_0, T]$.
Obviously, the cost $S(t, \varrho)$ of the optimal feedback control is in general smaller
than the minimum of $E[J_\bullet[\{u\}; t, \varrho]]$ over nonstochastic strategies $\{u(r)\}$ only,
which gives the solution of the open-loop (without feedback) quantum control
problem. In the case of the linear costs (18) this open-loop problem is equivalent
to the following quantum deterministic optimization problem, which can be
tackled by the classical theory of optimal deterministic control in the corresponding
Banach spaces.
4.1 Bellman & Hamilton-Pontryagin Optimality

Let us first consider nonstochastic quantum optimal control theory, assuming
that the state $\varrho(t) \in \mathcal{S}$ obeys the master equation (4), where $v(t, \cdot) =
\mathcal{L}'(t, u(t), \cdot)$ is the adjoint of some Lindblad generator, $\mathcal{L}'(t, u, \cdot) = v(t, u, \cdot)$ for
each $t$ and $u$ with, say, the control being exercised in the Hamiltonian component
$i[\cdot, H(t, u)]$ as before. The control strategy $\{u(t)\}$ will here be non-random, as
will be any specific cost $J[\{u\}; t_0, \varrho_0]$. For times $t < t + \varepsilon < T$, one has

$$ S(t, \varrho) = \inf_{\{u\}} \left\{ \int_t^{t+\varepsilon} C(r, u(r), \varrho(r))\, dr + \int_{t+\varepsilon}^T C(r, u(r), \varrho(r))\, dr + S(\varrho(T)) \right\}. $$

Suppose that $\{u^*(r) : r \in [t, T]\}$ is an optimal control when starting in state
$\varrho$ at time $t$, and denote by $\{P_r(t, \varrho) : r \in [t, T]\}$ the corresponding state
dynamics $\varrho^*(r) = P_r(t, \varrho)$, $P_t(t, \varrho) = \varrho$. Bellman's optimality principle observes
that the control $\{u^*(r) : r \in [t + \varepsilon, T]\}$ will then be optimal when starting from
$\varrho^*(t + \varepsilon)$ at the later time $t + \varepsilon$. It therefore follows that

$$ S(t, \varrho) = \inf_{\{u(r)\}} \left\{ \int_t^{t+\varepsilon} C(r, u(r), \varrho(r))\, dr + S(t + \varepsilon, \varrho(t + \varepsilon)) \right\}. $$


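For $\varepsilon$ small we expect that $\varrho(t + \varepsilon) = \varrho + v(t, u(t), \varrho)\, \varepsilon + o(\varepsilon)$, and provided
that $S$ is sufficiently smooth we may make the Taylor expansion

$$ S(t + \varepsilon, \varrho(t + \varepsilon)) = \left[ 1 + \varepsilon \frac{\partial}{\partial t} + \varepsilon \langle v(t, u(t), \varrho), \delta \rangle \right] S(t, \varrho) + o(\varepsilon). \tag{20} $$

In addition, we approximate

$$ \int_t^{t+\varepsilon} C(r, u(r), \varrho(r))\, dr = \varepsilon\, C(t, u(t), \varrho) + o(\varepsilon) $$

and conclude that

$$ S(t, \varrho) = \inf_{u \in U} \left\{ \varepsilon\, C(t, u, \varrho) + \left[ 1 + \varepsilon \frac{\partial}{\partial t} + \varepsilon \langle v(t, u, \varrho), \delta \rangle \right] S(t, \varrho) \right\} + o(\varepsilon) $$

where now the infimum is taken over the point-value of $u(t) = u \in U$. In the
limit $\varepsilon \to 0$, one obtains the equation

$$ \frac{\partial}{\partial t} S(t, \varrho) + \inf_{u \in U} \{ C(t, u, \varrho) + \langle v(t, u, \varrho), \delta S(t, \varrho) \rangle \} = 0. \tag{21} $$

The equation is then to be solved subject to the terminal condition

$$ S(T, \varrho) = S(\varrho). \tag{22} $$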

We may introduce the Pontryagin Hamiltonian function on $[0, T] \times \mathcal{S} \times \mathcal{A}$
defined by the Legendre-Fenchel transform

$$ \mathcal{H}_v(t, \varrho, X) := \sup_{u \in U} \{ \langle v(t, u, \varrho), \lambda I - X \rangle - C(t, u, \varrho) \} \tag{23} $$

(which in fact does not depend on $\lambda \in \mathbb{C}$ since $\langle v(t, u, \varrho), I \rangle = 0$). It should be
emphasized that these Hamiltonians are purely classical devices which may be
called super-Hamiltonians to be distinguished from $H$. We may then rewrite
(21) as the (backward) Hamilton-Jacobi equation

$$ \frac{\partial}{\partial t} S(t, \varrho) = \mathcal{H}_v(t, \varrho, \delta S(t, \varrho)). \tag{24} $$
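The passage from the Bellman equation (21) to the Hamilton-Jacobi form (24) rests on a single sign manipulation, which may be spelled out as follows. This worked step is an added illustration; it takes $\lambda = 0$ in (23), which is harmless because $\langle v, I \rangle = 0$.

```latex
\begin{align*}
\inf_{u \in U} \bigl\{ C(t,u,\varrho) + \langle v(t,u,\varrho), \delta S \rangle \bigr\}
  &= -\sup_{u \in U} \bigl\{ \langle v(t,u,\varrho), -\delta S \rangle - C(t,u,\varrho) \bigr\} \\
  &= -\mathcal{H}_v(t, \varrho, \delta S),
\end{align*}
```

so that (21) reads $\partial S / \partial t - \mathcal{H}_v(t, \varrho, \delta S) = 0$, which is (24).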

The operator-valued function $X(t, \varrho) = \delta S(t, \varrho)$, which then satisfies the equation
$\frac{\partial}{\partial t} X = \delta \mathcal{H}_v(t, \varrho, X)$, is referred to as the co-state, with the terminal condition
$X(T, \varrho) = \delta S(\varrho)$. We remark that, if $u^*(t, \varrho, X)$ is an optimal control
maximizing

$$ \mathcal{K}_v(t, u, \varrho, X) = \langle v(t, u, \varrho), \lambda I - X \rangle - C(t, u, \varrho), $$

then the corresponding state dynamical equation $\frac{d}{dt} \varrho = v(t, u^*(t, \varrho, X), \varrho)$, in
terms of its optimal solution $P_t = P_t(t_0, \varrho_0)$ corresponding to $P_{t_0} = \varrho^*(t_0) = \varrho_0$,
can be written as $\dot{P} = -\nabla_Q \mathcal{H}_v(t, P, Q)$, noting that

$$ \mathcal{H}_v(t, P, Q) = \langle v(t, u^*(t, P, Q), P), \lambda I - Q \rangle - C(t, u^*(t, P, Q), P), $$

where $Q_t = X(t)$ is the solution $X(t) = Q_t(T, S)$ of $\dot{Q} = \nabla_P \mathcal{H}_v(t, P, Q)$
corresponding to $Q_T = \delta S(\varrho) = S$. Thus we may equivalently consider the
system of Hamiltonian equations

$$ \begin{cases} \dot{P}_t + \nabla_Q \mathcal{H}_v(t, P_t, Q_t) = 0, \\ \dot{Q}_t - \nabla_P \mathcal{H}_v(t, P_t, Q_t) = 0, \end{cases} \tag{25} $$

which we refer to as the Hamilton-Pontryagin equations, in direct analogy with
the classical case [?]. If we set $u^* = u^*(t, P, Q)$ such that $\mathcal{K}_v(t, u^*, P, Q) =
\sup_{u \in U} \mathcal{K}_v(t, u, P, Q)$, then the Pontryagin maximum principle is the
observation that, for state and co-state $\{P_t\}$ and $\{Q_t\}$ respectively leading to optimality,
we will have $\mathcal{K}_v(t, u, P, Q) \le \mathcal{H}_v(t, P, Q)$ with equality for $u = u^*(t, P, Q)$
maximizing $\mathcal{K}_v(t, u, P, Q)$.



4.2 Bellman Equation for Filtered Dynamics 

We now consider the stochastic differential equation (8) for the filtered state in
place of the master equation (4). This time, the cost is random and we consider
the problem of computing the minimum average cost as in (19). The Bellman
principle can however be applied once more. As before, we let $\{u^*_\bullet(t)\}$ be a
stochastic adapted control leading to optimality and let $\varrho^*_\bullet(r) = P_{r,\bullet}(t, \varrho)$ be
the corresponding state trajectory (now a stochastic process) starting from $\varrho$ at
time $t$. Again choosing $t < t + \varepsilon < T$, we have by the Bellman principle

$$ S(t, \varrho) = \inf E\left[ \int_t^{t+\varepsilon} C(r, u_\bullet(r), \varrho_\bullet(r))\, dr + S(t + \varepsilon, \varrho_\bullet(t + \varepsilon)) \right] $$
$$ \phantom{S(t, \varrho)} = S(t, \varrho) + \inf_{u \in U} \left\{ \frac{\partial S}{\partial t}(t, \varrho) + C(t, u, \varrho) + D(t, u, \varrho) S(t, \varrho) \right\} \varepsilon + o(\varepsilon). $$

Taking the limit $\varepsilon \to 0$ yields the diffusive Bellman equation

$$ \frac{\partial S}{\partial t} + \inf_{u \in U} \{ C(t, u, \varrho) + D(t, u, \varrho) S(t, \varrho) \} = 0. $$

This equation is to be solved backward with the terminal condition $S(T, \varrho) = S(\varrho)$.
Using the super-Hamiltonian function

$$ \mathcal{H}_w(t, \varrho, X) := \sup_{u \in U} \{ \langle w(t, u, \varrho), \lambda I - X \rangle - C(t, u, \varrho) \} $$

this can be written in the Hamilton-Jacobi form as

$$ \frac{\partial S}{\partial t} + \tfrac{1}{2} \langle \sigma(\varrho) \otimes \sigma(\varrho), (\delta \otimes \delta) S \rangle = \mathcal{H}_w(t, \varrho, \delta S). \tag{26} $$

5 Stochastic Hamilton-Jacobi-Bellman Equation 

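An alternative approach to deriving the equation (26) will now be formulated.
First of all we make a Wong-Zakai approximation [30] to the Stratonovich
filtering equation (12). This is achieved by introducing a differentiable process
$W^{(\lambda)}_\omega(t) = \int_0^t \omega^{(\lambda)}(r)\, dr$ converging to the Wiener noise $W_\omega(t)$ as $\lambda \to 0$
almost surely and uniformly for $t \in [0, T]$. We may then expect the same type of
convergence for $\{\varrho^{(\lambda)}_\omega(t)\}$, the solution to the random ODE

$$ \frac{d}{dt} \varrho^{(\lambda)}_\omega(t) = v(t, u(t), \varrho^{(\lambda)}_\omega(t)) + \sigma(\varrho^{(\lambda)}_\omega(t))\, \omega^{(\lambda)}(t) $$

with non-random initial condition $\varrho^{(\lambda)}_\omega(t_0) = \varrho_0$, to the solution $\{\varrho_\omega(t)\}$ with
the same initial data $\varrho_\omega(t_0) = \varrho_0$.

If we fix the output $\omega \in \Omega$, then we have an equivalent non-random
dynamical system for which we will have a minimal cost function, and we denote
this $S^{(\lambda)}_\omega(t_0, \varrho_0)$. Note that this depends on the assumed realization of the
measurement output process and on the approximation parameter $\lambda$. The HJB
equation for $S^{(\lambda)}_\omega(t, \varrho)$ will be (24) with $v(t)$ now replaced by $v(t) + \sigma \omega^{(\lambda)}(t)$:

$$ \frac{\partial}{\partial t} S^{(\lambda)}_\omega + \langle \sigma(\varrho), \delta S^{(\lambda)}_\omega \rangle\, \omega^{(\lambda)}(t) = \mathcal{H}_v(t, \varrho, \delta S^{(\lambda)}_\omega). $$

Since $\sigma(\varrho)\, \omega^{(\lambda)}(t)$ does not depend on $u$, the corresponding optimal strategy
$u^*(t)$, as the solution of the optimization problem

$$ \inf_{u \in U} \{ C(u, \varrho) + \langle v(u, \varrho) + \sigma(\varrho)\, \omega^{(\lambda)}, X \rangle \} = \langle \sigma(\varrho)\, \omega^{(\lambda)}, X \rangle - \mathcal{H}_v(\varrho, X), $$

is the same function $u^*(t, \varrho, X)$ of $\varrho = \varrho^{(\lambda)}_\omega(t)$ and $X = \delta S^{(\lambda)}_\omega$, independent of
$\omega^{(\lambda)}(t)$. In the limit $\lambda \to 0$ we obtain the Stratonovich SDE

$$ dS_\omega(t, \varrho) + \langle \sigma(\varrho), \delta S_\omega(t, \varrho) \rangle \circ dW(t) = \mathcal{H}_v(t, \varrho, \delta S_\omega(t, \varrho))\, dt, \tag{27} $$

which may be called a stochastic Hamilton-Jacobi-Bellman equation.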
5.1 Interpretation of the Stochastic HJB equation

The expression $S_\omega(t_0, \varrho_0)$ gives the optimal cost from start time $t_0$ to terminal
time $T$ when we begin in state $\varrho_0$ and have measurement output $\omega \in \Omega$. It
evidently depends on the information $\{\omega(r) : r \in [t_0, T]\}$ only and is statistically
independent of the noise $\dot{W}(t) = \omega(t)$ prior to time $t_0$. In this sense, the stochastic
action $S_\omega(t, \varrho)$ is backward-adapted. This point is of crucial importance: it
means that the stochastic Hamilton-Jacobi-Bellman theory is not related
directly to the stochastic Hamilton-Jacobi theory [?] where both the state and
the action are always taken to be forward-adapted; it also means that we need
to be careful when converting (27) to Itô form. This is a direct consequence of
the fact that Bellman's principle works by backward induction.

Let us introduce the following time-reversed notations

$$ \tilde{t} := T - t, \qquad \tilde{W}(\tilde{t}) := W(T - \tilde{t}) = W(t), \qquad \tilde{S}_\omega(\tilde{t}, \varrho) := S_\omega(T - \tilde{t}, \varrho) = S_\omega(t, \varrho). $$

The process $\tilde{t} \mapsto \tilde{S}_\bullet(\tilde{t}, \varrho)$ is forward adapted to the filtration generated by $\tilde{W}$:
that is, $\tilde{S}_\bullet(\tilde{t}, \varrho)$ is measurable with respect to the sigma algebra generated by
$\{\tilde{W}(s) : s \in [0, \tilde{t}]\}$. Note that the Itô differential $d\tilde{W}(\tilde{t}) = \tilde{W}(\tilde{t} + \varepsilon) - \tilde{W}(\tilde{t})$
coincides with $W(t - \varepsilon) - W(t) = -\tilde{d}W(t)$ for $t = T - \tilde{t}$.

Theorem 1 The stochastic process $\{S_\bullet(t, \varrho) : t \in [0, T]\}$ satisfies the backward
Itô SDE

$$ dS_\bullet + \tfrac{1}{2} \langle \sigma, \delta \langle \sigma, \delta S_\bullet \rangle \rangle\, dt + \langle \sigma, \delta S_\bullet \rangle\, \tilde{d}W = \mathcal{H}_v(t, \varrho, \delta S_\bullet)\, dt \tag{28} $$

where $\tilde{d}W(t) := W(t) - W(t - dt)$ is the past-pointing Itô differential.

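Proof. For simplicity we suppress the $\varrho$ dependences. We shall take $\varepsilon > 0$
to be infinitesimal and recast (27) in the form

$$ S_\bullet(t + \tfrac{1}{2}\varepsilon) - S_\bullet(t - \tfrac{1}{2}\varepsilon) - \mathcal{H}_v(t, \delta S_\bullet(t))\, \varepsilon + \langle \sigma, \delta S_\bullet(t) \rangle \left[ W(t + \tfrac{1}{2}\varepsilon) - W(t - \tfrac{1}{2}\varepsilon) \right] = o(\varepsilon). $$

In time-reversed notations, this becomes

$$ \tilde{S}_\bullet(\tilde{t} - \tfrac{1}{2}\varepsilon) - \tilde{S}_\bullet(\tilde{t} + \tfrac{1}{2}\varepsilon) + \langle \sigma, \delta \tilde{S}_\bullet(\tilde{t}) \rangle \left[ \tilde{W}(\tilde{t} - \tfrac{1}{2}\varepsilon) - \tilde{W}(\tilde{t} + \tfrac{1}{2}\varepsilon) \right] - \tilde{\mathcal{H}}_v(\tilde{t}, -\delta \tilde{S}_\bullet(\tilde{t}))\, \varepsilon = o(\varepsilon), $$

where $\tilde{\mathcal{H}}_v(\tilde{t}, \varrho, X) = \mathcal{H}_v(t, \varrho, -X)$. We then have the forward-time equation

$$ \tilde{S}_\bullet(\tilde{t} + \tfrac{1}{2}\varepsilon) - \tilde{S}_\bullet(\tilde{t} - \tfrac{1}{2}\varepsilon) + \tilde{\mathcal{H}}_v(\tilde{t}, -\delta \tilde{S}_\bullet(\tilde{t}))\, \varepsilon + \langle \sigma, \delta \tilde{S}_\bullet(\tilde{t}) \rangle \left[ \tilde{W}(\tilde{t} + \tfrac{1}{2}\varepsilon) - \tilde{W}(\tilde{t} - \tfrac{1}{2}\varepsilon) \right] = o(\varepsilon), $$

and using the Itô-Stratonovich transformation

$$ \langle \sigma, \delta \tilde{S}_\bullet(\tilde{t}) \rangle \left[ \tilde{W}(\tilde{t} + \tfrac{1}{2}\varepsilon) - \tilde{W}(\tilde{t} - \tfrac{1}{2}\varepsilon) \right] = \langle \sigma, \delta \tilde{S}_\bullet(\tilde{t}) \rangle \left[ \tilde{W}(\tilde{t} + \varepsilon) - \tilde{W}(\tilde{t}) \right] - \tfrac{1}{2} \langle \sigma, \delta \langle \sigma, \delta \tilde{S}_\bullet(\tilde{t}) \rangle \rangle\, \varepsilon $$

we get by substitution

$$ \tilde{S}_\bullet(\tilde{t} + \varepsilon) - \tilde{S}_\bullet(\tilde{t}) - \tfrac{1}{2} \langle \sigma, \delta \langle \sigma, \delta \tilde{S}_\bullet(\tilde{t}) \rangle \rangle\, \varepsilon + \tilde{\mathcal{H}}_v(\tilde{t}, -\delta \tilde{S}_\bullet(\tilde{t}))\, \varepsilon + \langle \sigma, \delta \tilde{S}_\bullet(\tilde{t}) \rangle \left[ \tilde{W}(\tilde{t} + \varepsilon) - \tilde{W}(\tilde{t}) \right] = o(\varepsilon) $$

or, in the backward form for the original $S_\bullet(t, \varrho) = \tilde{S}_\bullet(T - t, \varrho)$,

$$ S_\bullet(t) - S_\bullet(t - \varepsilon) + \tfrac{1}{2} \langle \sigma, \delta \langle \sigma, \delta S_\bullet(t) \rangle \rangle\, \varepsilon - \mathcal{H}_v(t, \delta S_\bullet(t))\, \varepsilon + \langle \sigma, \delta S_\bullet(t) \rangle \left[ W(t) - W(t - \varepsilon) \right] = o(\varepsilon). $$

In differential form this is clearly the same as (28). ■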

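If we denote by $E^{(t_0, \varrho_0)}$ expectation (conditional on $\varrho_\omega(t_0) = \varrho_0$), then
$E^{(t_0, \varrho_0)}\left[ \langle \sigma, \delta S_\bullet(t) \rangle\, \tilde{d}W(t) \right] = 0$ since $S_\bullet(t)$, and its derivatives, are
independent of the mean-zero past-pointing Itô differentials. We then have as a corollary
that the averaged cost $S(t, \varrho)$ defined by

$$ S(t_0, \varrho_0) := E^{(t_0, \varrho_0)}\left[ S_\bullet(t_0, \varrho_0) \right] $$

will satisfy the equivalent diffusive Hamilton-Jacobi equation

$$ \frac{\partial S}{\partial t} + \tfrac{1}{2} \langle \sigma, \delta \langle \sigma, \delta S \rangle \rangle = \mathcal{H}_v(t, \varrho, \delta S) \tag{29} $$

which is the Hörmander form of the Bellman equation for the optimal cost $S(t, \varrho)$.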



6 Linear-State Cost 

A special case applicable to quantum mechanics arises when $C(t, u, \varrho)$ and $S(\varrho)$ are
both linear (18) in the state $\varrho$, with quadratic dependence of $C$ on $u$. Let us
restrict for simplicity to a time-independent cost observable with control
parameter $u = (u^1, \dots, u^n) \in \mathbb{R}^n$ and having a quadratic dependence of the form
(Einstein index notation!)

$$ C(u) = \tfrac{1}{2} g_{\alpha\beta} u^\alpha u^\beta + u^\alpha F_\alpha + C_0 $$

where $(g_{\alpha\beta})$ are the components of a symmetric positive definite metric with
inverse denoted $(g^{\alpha\beta})$ and $F_1, \dots, F_n, C_0$ are fixed bounded operators. We take
the control Hamiltonian operator to be

$$ H(u) = u^\alpha V_\alpha $$

where $V_1, \dots, V_n$ are fixed bounded observables. Our aim is to find the optimal
value $u^*$ for each pair $(P, Q)$ giving a minimum to $\langle P, C(u) \rangle + \langle w(t, u, P), Q \rangle =
-\mathcal{K}_w(t, u, P, Q)$: we will have

$$ 0 = \frac{\partial}{\partial u^\alpha} \{ \langle P, C(u) \rangle + \langle w(t, u, P), Q \rangle \} = g_{\alpha\beta} u^\beta + \langle P, F_\alpha \rangle + \langle i[P, V_\alpha], Q \rangle. $$

Thus the optimal control $u^*(P, Q)$ is given by the components

$$ u^{*\alpha} = -g^{\alpha\beta} \left\langle P, F_\beta + \tfrac{1}{i} [Q, V_\beta] \right\rangle. $$

This yields a unique point of infimum, and on substituting we determine that

$$ \mathcal{H}_w(P, Q) = \tfrac{1}{2} g^{\alpha\beta} \left\langle P, F_\alpha + \tfrac{1}{i} [Q, V_\alpha] \right\rangle \left\langle P, F_\beta + \tfrac{1}{i} [Q, V_\beta] \right\rangle - \langle P, C_0 + \mathcal{L}_R(Q) + \mathcal{L}_L(Q) \rangle. $$
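The minimization over $u$ used here is ordinary finite-dimensional quadratic optimization: writing $b_\alpha$ for the real numbers $\langle P, F_\alpha + \tfrac{1}{i}[Q, V_\alpha] \rangle$, the $u$-dependent part of the expression is $\tfrac{1}{2} g_{\alpha\beta} u^\alpha u^\beta + b_\alpha u^\alpha$, with minimizer $u^{*\alpha} = -g^{\alpha\beta} b_\beta$ and minimum $-\tfrac{1}{2} g^{\alpha\beta} b_\alpha b_\beta$. A minimal sketch for $n = 2$ follows; the function names and the numerical values of $g$ and $b$ are illustrative assumptions, not data from the text.

```python
def cost(u, g, b):
    """u-dependent part (1/2) g_ab u^a u^b + b_a u^a of the expression to minimise."""
    n = len(u)
    quad = sum(g[a][c] * u[a] * u[c] for a in range(n) for c in range(n))
    return 0.5 * quad + sum(b[a] * u[a] for a in range(n))

def optimum(g, b):
    """Return u* = -g^{-1} b and the minimum -(1/2) b^T g^{-1} b for a 2x2 metric g."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    u_star = [-(g_inv[a][0] * b[0] + g_inv[a][1] * b[1]) for a in range(2)]
    value = -0.5 * sum(g_inv[a][c] * b[a] * b[c]
                       for a in range(2) for c in range(2))
    return u_star, value

g = [[2.0, 0.5], [0.5, 1.0]]   # symmetric positive definite metric (example)
b = [0.3, -0.7]                # stands in for <P, F_a + (1/i)[Q, V_a]> (example)
u_star, value = optimum(g, b)
```

Changing the sign of the minimum gives the quadratic term $\tfrac{1}{2} g^{\alpha\beta} b_\alpha b_\beta$ appearing in $\mathcal{H}_w(P, Q)$.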
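As a result, the Hamilton-Jacobi-Bellman equation takes the form

$$ \frac{\partial S}{\partial t} - \tfrac{1}{2} g^{\alpha\beta} \left\langle \varrho, F_\alpha + \tfrac{1}{i} [\delta S, V_\alpha] \right\rangle \left\langle \varrho, F_\beta + \tfrac{1}{i} [\delta S, V_\beta] \right\rangle + \langle \varrho, C_0 + \mathcal{L}_R(\delta S) + \mathcal{L}_L(\delta S) \rangle + \tfrac{1}{2} \langle \sigma(\varrho) \otimes \sigma(\varrho), (\delta \otimes \delta) S \rangle = 0, $$

the terminal condition being that $S(T, \varrho) = \langle \varrho, S \rangle$.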



6.1 Controlled Qubit 

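Let us illustrate the above for the case of a qubit (two-state system). The
problem we consider is similar to the one formulated in [?]. Denoting the Pauli
spin vector by $\vec{\varsigma} = (\varsigma_x, \varsigma_y, \varsigma_z)$ with

$$ \varsigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \varsigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \varsigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, $$

we may represent each state by a polarization vector $\vec{p} \in \mathbb{R}^3$ as

$$ \varrho = \tfrac{1}{2} (1 + \vec{p} \cdot \vec{\varsigma}), $$

where $|\vec{p}| \le 1$, while any observable takes the form

$$ Q = q_0 + \vec{q} \cdot \vec{\varsigma} $$

and we have the duality $\langle \varrho, Q \rangle = q_0 + \vec{q} \cdot \vec{p}$. We shall write $\vec{p} = (x, y, z)$ and
$\vec{q} = (q_x, q_y, q_z)$.

Let us suppose that we have maximal control of the Hamiltonian component of
the dynamics, that is, we set

$$ H(\vec{u}) = \tfrac{1}{2} \vec{u} \cdot \vec{\varsigma} $$

with control variable $\vec{u} \in \mathbb{R}^3$. We also ignore the effect of the environment and
take $\mathcal{L}_R = 0$. For simplicity, we shall take the cost to have the form

$$ C(t, u, \varrho) = \tfrac{1}{2} |\vec{u}|^2 $$

and we take the coupling of the system to the measurement apparatus to be
determined by the operator

$$ L = \tfrac{1}{2} \kappa\, \varsigma_z. $$

Explicitly we have

$$ \langle w(t, u, \varrho), Q \rangle = \vec{u} \cdot (\vec{p} \times \vec{q}) - \tfrac{\kappa^2}{2} (x q_x + y q_y) $$

from which we see that the minimizing control is $\vec{u}^* = \vec{q} \times \vec{p}$, leading to the
Hamiltonian function

$$ \mathcal{H}_w(\vec{p}, \vec{q}) = \tfrac{1}{2} |\vec{q} \times \vec{p}|^2 + \tfrac{\kappa^2}{2} (x q_x + y q_y). $$

Meanwhile, $\sigma(\varrho) = \tfrac{1}{2} \kappa (\varrho \varsigma_z + \varsigma_z \varrho) - \langle \varrho, \kappa \varsigma_z \rangle \varrho$ and so

$$ \langle \sigma(\varrho), Q \rangle = -\kappa z x q_x - \kappa z y q_y + \kappa (1 - z^2) q_z. $$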
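The Bloch-vector expressions for the pairings of $w$ and $\sigma$ with an observable can be compared against a direct matrix evaluation of the operator formulas for the time and fluctuation coefficients. The sketch below is an added illustration under stated assumptions: it takes $H(\vec{u}) = \tfrac{1}{2} \vec{u} \cdot \vec{\varsigma}$, $L = \tfrac{1}{2} \kappa \varsigma_z$ and $\mathcal{L}_R = 0$, and the helper names and numerical values are arbitrary choices for the example.

```python
def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lin(*terms):
    """Linear combination c1*m1 + c2*m2 + ... of 2x2 matrices."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)]
            for i in range(2)]

def dag(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def tr(a):
    return a[0][0] + a[1][1]

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def from_vector(c0, v):
    """c0 * identity + v . (Pauli vector)."""
    return lin((c0, I2), (v[0], SX), (v[1], SY), (v[2], SZ))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

kappa = 0.8
p = (0.3, -0.2, 0.5)               # polarization vector (example)
q0, q = 0.7, (0.4, 0.1, -0.6)      # observable Q = q0 + q . sigma (example)
u = (0.2, 0.9, -0.3)               # control vector (example)

rho = from_vector(0.5, [0.5 * c for c in p])
Q = from_vector(q0, q)
H = from_vector(0.0, [0.5 * c for c in u])
L = lin((0.5 * kappa, SZ))
Ld = dag(L)
LdL = mul(Ld, L)

# Operator expressions for the Ito drift (L_R = 0) and the fluctuation coefficient.
w = lin((1j, mul(rho, H)), (-1j, mul(H, rho)), (1, mul(mul(L, rho), Ld)),
        (-0.5, mul(rho, LdL)), (-0.5, mul(LdL, rho)))
ell = tr(mul(rho, lin((1, L), (1, Ld))))
sig = lin((1, mul(L, rho)), (1, mul(rho, Ld)), (-ell, rho))

x, y, z = p
w_matrix = tr(mul(w, Q))
w_bloch = (sum(ui * ci for ui, ci in zip(u, cross(p, q)))
           - 0.5 * kappa ** 2 * (x * q[0] + y * q[1]))
sig_matrix = tr(mul(sig, Q))
sig_bloch = kappa * (-z * x * q[0] - z * y * q[1] + (1 - z * z) * q[2])
```

For these sample values the matrix and Bloch-vector evaluations coincide to rounding error, which supports the coefficient $\kappa^2/2$ in the dissipative term under the stated choice of $L$.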

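With the customary abuse of notation, we write $S(t, \varrho) = S(t, x, y, z)$. The
Itô correction term, $\tfrac{1}{2} \langle \sigma(\varrho) \otimes \sigma(\varrho), (\delta \otimes \delta) S \rangle$, in the HJB equation is then given
by (with $S_{xy} = \frac{\partial^2 S}{\partial x \partial y}$, etc.)

$$ \frac{\kappa^2}{2} \begin{pmatrix} -zx, & -zy, & 1 - z^2 \end{pmatrix} \begin{pmatrix} S_{xx} & S_{xy} & S_{xz} \\ S_{yx} & S_{yy} & S_{yz} \\ S_{zx} & S_{zy} & S_{zz} \end{pmatrix} \begin{pmatrix} -zx \\ -zy \\ 1 - z^2 \end{pmatrix}. $$

Putting everything together, we find that the Hamilton-Jacobi-Bellman
equation is

$$ 0 = \frac{\partial S}{\partial t} - \frac{1}{2} \left| \vec{p} \times \nabla S \right|^2 - \frac{\kappa^2}{2} \left( x \frac{\partial S}{\partial x} + y \frac{\partial S}{\partial y} \right) + \frac{\kappa^2}{2} \left( x^2 z^2 \frac{\partial^2 S}{\partial x^2} + y^2 z^2 \frac{\partial^2 S}{\partial y^2} + (1 - z^2)^2 \frac{\partial^2 S}{\partial z^2} \right) $$
$$ \phantom{0 =} + \kappa^2 \left( x y z^2 \frac{\partial^2 S}{\partial x \partial y} - x z (1 - z^2) \frac{\partial^2 S}{\partial x \partial z} - y z (1 - z^2) \frac{\partial^2 S}{\partial y \partial z} \right). $$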
7 Discussion



In our analysis we have sought to think of the quantum state of a controlled 
system (that is, its von Neumann density matrix) in the same spirit as classical 
control engineers think about the state of the system. The advantage of this 
is that all the quantum features of the problem are essentially tied up in the 
state: once the measurements have been performed the information obtained 
can be treated as essentially classical, as can the problem of using this infor- 
mation to control the system in an optimal manner. The disadvantage is that 
we have to deal with a stochastic differential equation on the infinite dimen- 
sional space of quantum states. Nevertheless, the Bellman principle can then 
be applied in much the same spirit as for classical states and we are able to 
derive the corresponding Hamilton-Jacobi-Bellman theory for a wider class of
cost functionals than traditionally considered in the literature. When restricted 
to a finite-dimensional representation of the state (on the Bloch sphere for the 
qubit) with the cost being a quantum expectation, we recover the class of Bell- 
man equations encountered as standard in quantum feedback control. 